28 research outputs found

    Executing Large Scale Scientific Workflows in Public Clouds

    Scientists in fields such as high-energy physics, earth science, and astronomy are developing large-scale workflow applications. In many use cases, scientists need to run a set of interrelated but independent workflows (i.e., workflow ensembles) to complete an entire scientific analysis. Because a workflow ensemble usually contains many sub-workflows, each with hundreds or thousands of jobs under precedence constraints, executing such an ensemble raises serious cost concerns even on elastic, pay-as-you-go cloud resources. In this thesis, we develop a set of methods to optimize the execution of large-scale scientific workflows in public clouds under both cost and deadline constraints, using a two-step approach. First, we present a set of methods to optimize the execution of a scientific workflow in public clouds, using the Montage astronomical mosaic engine running on Amazon EC2 as an example. Second, we address three main challenges in realizing the benefits of public clouds when executing large-scale workflow ensembles: (1) execution coordination, (2) resource provisioning, and (3) data staging. To this end, we develop a new pulling-based workflow execution system with a profiling-based resource provisioning strategy. Our results show that, by removing scheduling overhead, our system can achieve an 80% speed-up over the well-known Pegasus workflow management system when running scientific workflow ensembles. In addition, our evaluation using Montage workflow ensembles on Amazon EC2 clusters of around 1000 cores demonstrates the efficacy of our resource provisioning strategy in terms of cost effectiveness within the deadline.
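The pulling-based idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the system from the thesis: the function name `run_workflow`, the `deps` dictionary format, and the job names are all hypothetical, and real job execution is replaced by a comment. The sketch only shows workers pulling ready jobs from a shared queue while precedence constraints are respected. It assumes every parent appears as a key in `deps` and that the graph has no cycles.

```python
import queue
import threading


def run_workflow(deps, num_workers=4):
    """Run a DAG of jobs with pull-based workers (illustrative only).

    deps maps each job name to the set of jobs it depends on
    (a hypothetical format). Returns jobs in completion order.
    """
    if not deps:
        return []

    # Unfinished parents per job, and the reverse (parent -> children) map.
    remaining = {job: set(parents) for job, parents in deps.items()}
    children = {job: [] for job in deps}
    for job, parents in deps.items():
        for parent in parents:
            children[parent].append(job)

    ready = queue.Queue()  # shared queue that idle workers pull from
    lock = threading.Lock()
    finished = []

    # Jobs with no parents are ready immediately.
    for job, parents in remaining.items():
        if not parents:
            ready.put(job)

    def worker():
        while True:
            job = ready.get()  # pull the next ready job
            if job is None:  # sentinel: the workflow is complete
                return
            # A real worker would execute the job here.
            with lock:
                finished.append(job)
                # Release children whose parents have all finished.
                for child in children[job]:
                    remaining[child].discard(job)
                    if not remaining[child]:
                        ready.put(child)
                # After the last job, wake every worker with a sentinel.
                if len(finished) == len(deps):
                    for _ in range(num_workers):
                        ready.put(None)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished


# Hypothetical four-job diamond: b and c depend on a, d depends on both.
example = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
order = run_workflow(example)
```

In this arrangement no central scheduler assigns jobs to nodes; each worker asks for work whenever it is idle, which is one plausible reading of how scheduling overhead might be avoided.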

    PM2.5-GNN: A Domain Knowledge Enhanced Graph Neural Network For PM2.5 Forecasting

    When predicting PM2.5 concentrations, it is necessary to consider complex information sources, since the concentrations are influenced by various factors over a long period. In this paper, we identify a set of critical domain knowledge for PM2.5 forecasting and develop a novel graph-based model, PM2.5-GNN, capable of capturing long-term dependencies. On a real-world dataset, we validate the effectiveness of the proposed model and examine its ability to capture both fine-grained and long-term influences in the PM2.5 process. The proposed PM2.5-GNN has also been deployed online to provide a free forecasting service. Comment: Pre-print version of an ACM SIGSPATIAL 2020 poster [paper](https://dl.acm.org/doi/10.1145/3397536.3422208). The code is available at [Github](https://github.com/shawnwang-tech/PM2.5-GNN), and the talk is available at [YouTube](https://www.youtube.com/watch?v=VX93vMthkGM).

    High photocatalytic activity of Cu2O embedded in hierarchically hollow SiO2 for efficient chemoselective hydrogenation of nitroarenes

    Photocatalytic organic conversion is a crucial process in the hydrogenation of nitroarenes, but harsh reaction conditions such as long reaction times, high hydrogen pressure, and organic media still need to be overcome under visible-light irradiation. Here, we construct a transition metal oxide photocatalyst by embedding low-cost Cu2O, which absorbs visible light strongly, into hierarchically hollow SiO2 spheres (SiO2-Cu2O@SiO2). Owing to the confinement effect, this structure suppresses the escape of photogenerated atomic hydrogen and increases the probability of contact between hydrogen atoms and nitroarene molecules. Remarkably, the SiO2-Cu2O@SiO2 photocatalyst exhibits efficient chemoselectivity toward the hydrogenation of various nitroarenes in an aqueous system at ambient conditions, removing the need for the strict hydrogenation conditions, especially the organic medium, required by almost all previously reported photocatalysts. Notably, aniline can be produced quantitatively in the visible-light catalytic reduction of nitroarenes, suggesting considerable potential for industrial application.

    Executing Large-Scale Workloads in Public Clouds

    In this research, we systematically study the execution of large-scale workloads in public clouds. We analyze the microeconomic behavior of the global enterprise computing resource market. The analysis results clearly indicate that public clouds represent the future of the enterprise computing resource market. To address the challenges involved in migrating from traditional computing resources to public clouds, we develop a set of methods to optimize the execution of large-scale workloads in public clouds, using scientific workflow, quality of service, and video transcoding as examples. Furthermore, we study the limit of the horizontal scaling technique in public clouds by identifying sources of cloud-scale bottlenecks, then quantitatively measuring their impact on the capacity of horizontally scalable applications. We start with an analysis of the price elasticity of demand in the global enterprise computing resource market, including the server sales business, the server rental business, and public clouds. We reveal that, from a microeconomic point of view, public clouds are fundamentally different from the traditional server sales and server rental businesses. The analysis results clearly indicate that public clouds represent the future of the enterprise computing resource market. To address the challenges involved in migrating from traditional computing resources to public clouds, we develop a set of techniques and tools to optimize the execution of large-scale workloads in public clouds, using scientific workflow, quality of service, and video transcoding as examples. We present DEWE v3, a workflow management system with function-as-a-service (FaaS) as the target execution environment. DEWE v3 reduces the effort needed to execute large-scale scientific workflows. It liberates scientists from the tedious administrative tasks involved in the traditional cluster approach, allowing them to focus on their own research work.
We present the design and implementation of Janus - a generic and scalable QoS framework for admission control purposes. We demonstrate that Janus achieves linear scalability both vertically and horizontally. We also use a photo sharing application to demonstrate that Janus can be used to provide QoS service for a wide range of SaaS applications. We analyze the challenges involved in large-scale video transcoding, leading to the design and implementation of our own video transcoding system. Large-scale evaluations indicate that the improved application maintains linear horizontal scalability at 10,100 vCPU cores. To understand the limit of public clouds, we develop ScaleBench as a distributed and parallel benchmark framework. ScaleBench is capable of generating substantial and sustainable workload on public clouds, simulating the resource consumption pattern of various horizontally scalable applications. In particular, it detects cloud-scale bottlenecks in compute unit, block storage, networking and object storage. In addition, we use a real-life video transcoding application to demonstrate that the horizontal scaling technique can fail to gain more capacity when such cloud-scale bottlenecks are reached. We propose the concept of capacity degradation index (CDI) to describe the degree of capacity degradation at scale. We perform extensive empirical studies on four public clouds. We observe significant capacity degradation in three of them. This confirms that on multiple public clouds cloud-scale capacity bottlenecks not only exist, but also can be easily detected by an ordinary cloud user. With as little as 20 worker nodes, we observe up to 24%, 52%, 14% and 90% capacity degradation in overall system performance, block storage, networking and object storage, respectively. We conduct large-scale experiments using a real-life video transcoding application, where the largest worker fleet utilizes 3200 vCPU cores. 
We demonstrate that when the above-mentioned cloud-scale bottleneck is reached, the capacity of the horizontally scalable application stops growing regardless of the growth in the number of nodes.

    Virtual Machine Performance Comparison of Public IaaS Providers in China (DRAFT)

    We compare the virtual machine (VM) performance of two public IaaS providers in China (GrandCloud and Aliyun). UnixBench and Hadoop wordcount are used to provide benchmark data for the comparison. We find that VM specifications such as the number of CPU cores and the amount of memory can no longer be used as a reference for VM performance. In both the UnixBench and Hadoop wordcount tests, the performance/price ratio falls as the VM gets bigger. In the Hadoop wordcount tests, a cluster of several smaller VMs provides much better performance, as well as a better performance/price ratio, than a single bigger VM. We recommend horizontal scaling when an application needs more computing resources.
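The performance/price comparison described above amounts to simple arithmetic, sketched below. All numbers are invented for illustration and are not measurements from the paper; the function name `perf_price_ratio` and the assumption that a cluster's score scales linearly with the number of VMs are also hypothetical.

```python
def perf_price_ratio(score, hourly_price):
    """Benchmark score obtained per unit of hourly price."""
    return score / hourly_price


# Hypothetical figures, not taken from the paper.
small_vm = perf_price_ratio(score=1000.0, hourly_price=0.5)
large_vm = perf_price_ratio(score=3000.0, hourly_price=2.0)

# Four small VMs, assuming (optimistically) linear scaling of the score.
cluster_score = 4 * 1000.0
cluster_price = 4 * 0.5
small_cluster = perf_price_ratio(cluster_score, cluster_price)

print(f"small VM:      {small_vm:.0f} points per price unit")
print(f"large VM:      {large_vm:.0f} points per price unit")
print(f"4 x small VMs: {small_cluster:.0f} points per price unit")
```

With these made-up inputs, the cluster of small VMs costs the same as the single large VM but delivers a higher total score, which is the kind of outcome the abstract reports.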

    Multiple homoclinic solutions for superquadratic Hamiltonian systems

    In this article we study the existence of infinitely many homoclinic solutions for a class of second-order Hamiltonian systems $\ddot{u}-L(t)u+W_u(t,u)=0, \quad \forall t\in\mathbb{R}$, where L is not required to be either uniformly positive definite or coercive, and W is superquadratic at infinity in u but does not need to satisfy the Ambrosetti-Rabinowitz superquadratic condition.

    Using N-doped Carbon Dots Prepared Rapidly by Microwave Digestion as Nanoprobes and Nanocatalysts for Fluorescence Determination of Ultratrace Isocarbophos with Label-Free Aptamers

    The strongly fluorescent and highly catalytic N-doped carbon dots (CDN) were rapidly prepared by a microwave irradiation procedure and were characterized by electron microscopy (EM), laser scattering, infrared spectroscopy (IR), and by their fluorescence spectrum. It was found that the CDN had a strong catalytic effect on the fluorescence reaction of 3,3′,5,5′-tetramethylbenzidine hydroxide ((TMB)-H2O2), which produced the oxidation product of TMB (TMBOX) with strong fluorescence at 406 nm. The aptamer (Apt) was adsorbed on the CDN surfaces, which weakened the fluorescence intensity due to the inhibition of catalytic activity. When the target molecule isocarbophos (IPS) was added, it reacted with the Apt to form a stable conjugate and free CDN, which restored the catalytic activity to enhance the fluorescence. Using TMBOX as a fluorescent probe, a highly sensitive nanocatalytic method for determination of 0.025-1.5 μg/L IPS was established with a detection limit of 0.015 μg/L. Coupling the CDN fluorescent probe with the Apt-IPS reaction, a new CD fluorescence method was established for the simple and rapid determination of 0.25-1.5 μg/L IPS with a detection limit of 0.11 μg/L.